Rank | Count | Beginning |
---|---|---|
1681 | 1169 | ई |
3595 | 509 | ओ |
6072 | 341 | नेपालक |
3050 | 221 | एकर |
5097 | 213 | जाहिमे |
610 | 209 | २०६८ |
9012 | 187 | सन् |
3431 | 91 | एहि |
9846 | 82 | हुनकर |
2910 | 45 | उनकर |
7310 | 42 | भारतक |
3011 | 41 | एक |
3336 | 37 | एतय |
5374 | 32 | जे |
3273 | 31 | एकरा |
7781 | 31 | मुदा |
1182 | 28 | अपन |
1565 | 27 | इ |
9929 | 27 | हुनका |
3510 | 26 | एहिमे |
8931 | 26 | स. |
9720 | 23 | हाल |
7883 | 21 | मैथिली |
6065 | 20 | नेपाल |
8033 | 20 | याह |
8890 | 20 | श्री |
1419 | 19 | आ |
6632 | 19 | पहिल |
9806 | 19 | हिन्दू |
6441 | 18 | नेपाली |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
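The same counting can be sketched outside the database. The following is a minimal Python illustration, assuming the sentences are available as a plain list of strings. The function names `first_n_words` and `most_frequent_beginnings` and the English toy sample are hypothetical and not part of the corpus tooling. The helper imitates what `substring_index(sentence, ' ', N)` does in the query: it keeps everything before the N-th space.

```python
from collections import Counter


def first_n_words(sentence: str, n: int = 1) -> str:
    """Return the text before the n-th space, like substring_index(sentence, ' ', n)."""
    return " ".join(sentence.split(" ")[:n])


def most_frequent_beginnings(sentences, n=1, limit=50):
    """Count sentence beginnings of n words and return the `limit` most frequent."""
    counts = Counter(first_n_words(s, n) for s in sentences)
    return counts.most_common(limit)


# Hypothetical toy sample, not taken from the corpus.
sample = [
    "The cat sat on the mat",
    "The dog ran away",
    "A bird flew past",
]

for beginning, count in most_frequent_beginnings(sample, n=1, limit=2):
    print(count, beginning)
```

Passing `n=2`, `n=3`, or `n=4` gives the longer beginnings covered in the following subsections. Sentences with fewer than `n` words are returned whole, which matches the behavior of `substring_index` when the delimiter occurs fewer than N times.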
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV